Atlantic City
- North America > United States > Oregon (0.04)
- North America > United States > New Jersey > Atlantic County > Atlantic City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Diagnostic Medicine (0.67)
- Law > Civil Rights & Constitutional Law (0.54)
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (10 more...)
- Europe > Jersey (0.41)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
- North America > United States > Virginia (0.05)
- (14 more...)
- North America > United States > Oregon (0.04)
- North America > United States > New Jersey > Atlantic County > Atlantic City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Diagnostic Medicine (0.67)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
- Europe > Spain > Galicia > Madrid (0.04)
- Oceania > Australia (0.04)
- North America > United States > North Carolina (0.04)
- (7 more...)
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (12 more...)
VeFIA: An Efficient Inference Auditing Framework for Vertical Federated Collaborative Software
Huang, Chung-ju, Zhang, Ziqi, Wang, Yinggui, Wang, Binghui, Wei, Tao, Wang, Leye
Vertical Federated Learning (VFL) is a distributed AI software deployment mechanism for cross-silo collaboration without accessing participants' data. However, existing VFL work lacks a mechanism to audit the execution correctness of the inference software of the data party. To address this problem, we design a Vertical Federated Inference Auditing (VeFIA) framework. VeFIA helps the task party to audit whether the data party's inference software is executed as expected during large-scale inference without leaking the data privacy of the data party or introducing additional latency to the inference system. The core of VeFIA is that the task party can use the inference results from a framework with Trusted Execution Environments (TEE) and the coordinator to validate the correctness of the data party's computation results. VeFIA guarantees that, as long as the abnormal inference exceeds 5.4%, the task party can detect execution anomalies in the inference software with a probability of 99.99%, without incurring any additional online inference latency. VeFIA's random sampling validation achieves 100% positive predictive value, negative predictive value, and true positive rate in detecting abnormal inference. To the best of our knowledge, this is the first paper to discuss the correctness of inference software execution in VFL.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > District of Columbia > Washington (0.14)
- (37 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Banking & Finance (1.00)
Data-Agnostic Cardinality Learning from Imperfect Workloads
Wu, Peizhi, Kang, Rong, Zhang, Tieying, Chen, Jianjun, Marcus, Ryan, Ives, Zachary G.
Cardinality estimation (CardEst) is a critical aspect of query optimization. Traditionally, it leverages statistics built directly over the data. However, organizational policies (e.g., regulatory compliance) may restrict global data access. Fortunately, query-driven cardinality estimation can learn CardEst models using query workloads. However, existing query-driven models often require access to data or summaries for best performance, and they assume perfect training workloads with complete and balanced join templates (or join graphs). Such assumptions rarely hold in real-world scenarios, in which join templates are incomplete and imbalanced. We present GRASP, a data-agnostic cardinality learning system designed to work under these real-world constraints. GRASP's compositional design generalizes to unseen join templates and is robust to join template imbalance. It also introduces a new per-table CardEst model that handles value distribution shifts for range predicates, and a novel learned count sketch model that captures join correlations across base relations. Across three database instances, we demonstrate that GRASP consistently outperforms existing query-driven models on imperfect workloads, both in terms of estimation accuracy and query latency. Remarkably, GRASP achieves performance comparable to, or even surpassing, traditional approaches built over the underlying data on the complex CEB-IMDb-full benchmark -- despite operating without any data access and using only 10% of all possible join templates.
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Atlantic County > Atlantic City (0.04)
- (2 more...)
Outsourced Privacy-Preserving Feature Selection Based on Fully Homomorphic Encryption
Wakiyama, Koki, I, Tomohiro, Sakamoto, Hiroshi
Feature selection is a technique that extracts a meaningful subset from a set of features in training data. When the training data is large-scale, appropriate feature selection enables the removal of redundant features, which can improve generalization performance, accelerate the training process, and enhance the interpretability of the model. This study proposes a privacy-preserving computation model for feature selection. Generally, when the data owner and analyst are the same, there is no need to conceal the private information. However, when they are different parties or when multiple owners exist, an appropriate privacy-preserving framework is required. Although various private feature selection algorithms, they all require two or more computing parties and do not guarantee security in environments where no external party can be fully trusted. To address this issue, we propose the first outsourcing algorithm for feature selection using fully homomorphic encryption. Compared to a prior two-party algorithm, our result improves the time and space complexity O(kn^2) to O(kn log^3 n) and O(kn), where k and n denote the number of features and data samples, respectively. We also implemented the proposed algorithm and conducted comparative experiments with the naive one. The experimental result shows the efficiency of our method even with small datasets.
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > New Jersey > Atlantic County > Atlantic City (0.04)
- (12 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.68)
Design and Analysis of an Extreme-Scale, High-Performance, and Modular Agent-Based Simulation Platform
Agent-based modeling is indispensable for studying complex systems across many domains. However, existing simulation platforms exhibit two major issues: performance and modularity. Low performance prevents simulations with a large number of agents, increases development time, limits parameter exploration, and raises computing costs. Inflexible software designs motivate modelers to create their own tools, diverting valuable resources. This dissertation introduces a novel simulation platform called BioDynaMo and its significant improvement, TeraAgent, to alleviate these challenges via three major works. First, we lay the platform's foundation by defining abstractions, establishing software infrastructure, and implementing a multitude of features for agent-based modeling. We demonstrate BioDynaMo's modularity through use cases in neuroscience, epidemiology, and oncology. We validate these models and show the simplicity of adding new functionality with few lines of code. Second, we perform a rigorous performance analysis and identify challenges for shared-memory parallelism. Provided solutions include an optimized grid for neighbor searching, mechanisms to reduce the memory access latency, and exploiting domain knowledge to omit unnecessary work. These improvements yield up to three orders of magnitude speedups, enabling simulations of 1.7 billion agents on a single server. Third, we present TeraAgent, a distributed simulation engine that allows scaling out the computation of one simulation to multiple servers. We identify and address server communication bottlenecks and implement solutions for serialization and delta encoding to accelerate and reduce data transfer. TeraAgent can simulate 500 billion agents and scales to 84096 CPU cores. BioDynaMo has been widely adopted, including a prize-winning radiotherapy simulation recognized as a top 10 breakthrough in physics in 2024.
- North America > United States > Texas > Travis County > Austin (0.13)
- Europe > Austria > Vienna (0.13)
- North America > United States > New York > New York County > New York City (0.04)
- (39 more...)
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- (2 more...)